Phoneme Dedicated ANN Improves Segmental Duration Model
نویسندگان
چکیده
The Phoneme Dedicated Artificial Neural Network (PDANN) segmental duration model consists of a set of ANNs trained specifically for each phoneme segment in order to avoid miscellaneous influence of different types of phoneme segments. Therefore, each ANN is dedicated to predict the duration of a specific phoneme segment. Objective and subjective measurements of the performance of the PDANN model were compared with those of a typical ANN model using the same input features and database. The results indicate a slight, but clear, perceptually perceived preference towards the PDANN.
منابع مشابه
Use of Phoneme Dedicated Artificial Neural Networks to Predict Segmental Durations
The results of two alternative models to predict segmental durations in speech synthesis, both based on Artificial Neural Networks (ANNs) are discussed. The ANN model consists in just one ANN trained to predict the segmental durations for all phonemes. The phoneme dedicated ANN model consists in a set of ANNs, each one dedicated to predict the segmental duration of a specific phoneme. Both mode...
متن کاملImproving Phoneme Sequence Recognition using Phoneme Duration Information in DNN-HSMM
Improving phoneme recognition has attracted the attention of many researchers due to its applications in various fields of speech processing. Recent research achievements show that using deep neural network (DNN) in speech recognition systems significantly improves the performance of these systems. There are two phases in DNN-based phoneme recognition systems including training and testing. Mos...
متن کاملDuration modeling for hindi text-to-speech synthesis system
This paper reports preliminary results of data-driven modeling of segmental (phoneme) duration for Hindi. Classification and Regression Tree (CART) based datadriven duration modeling for segmental duration prediction is presented. A number of features are considered and their usefulness and relative contribution for segmental duration prediction is assessed. Objective evaluation of the duration...
متن کاملA CART approach for Duration Modeling of Greek Phonemes
This paper describes the construction and evaluation of a segmental duration prediction model for Greek language with the application of CART (Classification and Regression Tree) machine learning approach. A ToBI annotated prosodic speech corpus was utilized for the construction of training and testing sets. Our phoneme category was composed of 34 phonemes distributed in 32.072 instances (in 5....
متن کاملImproved Classification of Speaking Styles for Mental Health Monitoring Using Phoneme Dynamics
This paper investigates the usefulness of segmental phonemedynamics for classification of speaking styles. We modeled transition details based on the phoneme sequences emitted by a speech recognizer, using data obtained from a recording of 39 depressed patients with 7 different speaking styles normal, pressured, slurred, stuttered, flat, slow and fast speech. We designed and compared two set of...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره شماره
صفحات -
تاریخ انتشار 2008